Sparse Additive Generative Models of Text
نویسندگان
چکیده
Generative models of text typically associate a multinomial with every class label or topic. Even in simple models this requires the estimation of thousands of parameters; in multifaceted latent variable models, standard approaches require additional latent “switching” variables for every token, complicating inference. In this paper, we propose an alternative generative model for text. The central idea is that each class label or latent topic is endowed with a model of the deviation in log-frequency from a constant background distribution. This approach has two key advantages: we can enforce sparsity to prevent overfitting, and we can combine generative facets through simple addition in log space, avoiding the need for latent switching variables. We demonstrate the applicability of this idea to a range of scenarios: classification, topic modeling, and more complex multifaceted generative models.
منابع مشابه
Voice-based Age and Gender Recognition using Training Generative Sparse Model
Abstract: Gender recognition and age detection are important problems in telephone speech processing to investigate the identity of an individual using voice characteristics. In this paper a new gender and age recognition system is introduced based on generative incoherent models learned using sparse non-negative matrix factorization and atom correction post-processing method. Similar to genera...
متن کاملScalable Text Mining with Sparse Generative Models
The information age has brought a deluge of data. Much of this is in text form, insurmountable in scope for humans and incomprehensible in structure for computers. Text mining is an expanding field of research that seeks to utilize the information contained in vast document collections. General data mining methods based on machine learning face challenges with the scale of text data, posing a n...
متن کاملImprovement of generative adversarial networks for automatic text-to-image generation
This research is related to the use of deep learning tools and image processing technology in the automatic generation of images from text. Previous researches have used one sentence to produce images. In this research, a memory-based hierarchical model is presented that uses three different descriptions that are presented in the form of sentences to produce and improve the image. The proposed ...
متن کاملExperiments with Generative Models for Dependency Tree Linearization
We present experiments with generative models for linearization of unordered labeled syntactic dependency trees (Belz et al., 2011; Rajkumar and White, 2014). Our linearization models are derived from generative models for dependency structure (Eisner, 1996). We present a series of generative dependency models designed to capture successively more information about ordering constraints among si...
متن کاملCorpora and exemplars in phonology
ionist models typically work with sparse lexicons, with as only exception in generative grammar the work of Jackendoff (1975). Once the linguistic generalizations of the language have been deduced from the input, the input words are no longer needed to support the generalizations. If they are morphologically complex and completely regular in all respects, they can even be removed from the lexic...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2011